Speech Emotion Recognition Based on Deep Residual Shrinkage Network
Authors
Abstract
Speech emotion recognition (SER) technology is significant for human–computer interaction, and this paper studies the features and modeling of SER. The mel-spectrogram is introduced and utilized as the speech feature, and the theory and extraction process of the mel-spectrogram are presented in detail. A deep residual shrinkage network with a bi-directional gated recurrent unit (DRSN-BiGRU) is proposed in this paper, which is composed of a convolution network, a bi-directional gated recurrent unit, and a fully-connected network. Through a self-attention mechanism, DRSN-BiGRU can automatically ignore noisy information and improve its ability to learn effective features. Network optimization and verification experiments were carried out on three emotional datasets (CASIA, IEMOCAP, and MELD), with accuracies of 86.03%, 86.07%, and 70.57%, respectively. The results are also analyzed and compared with DCNN-LSTM, CNN-BiLSTM, and DRN-BiGRU, which verifies the superior performance of DRSN-BiGRU.
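As a rough illustration of two ideas named in the abstract, the sketch below shows the common Hz-to-mel mapping that underlies a mel-spectrogram and a soft-thresholding step of the kind residual shrinkage networks usually use to suppress small, noise-like activations. The abstract does not give either formula, so both are assumptions: the mel formula is the widely used 2595 * log10(1 + f/700) variant, and the threshold is taken as a scale factor times the mean absolute activation. The names hz_to_mel, mel_to_hz, soft_threshold, shrink, and alpha are hypothetical; in a real network alpha would be learned rather than passed in.

```python
import math


def hz_to_mel(f_hz):
    """Map a frequency in Hz to the mel scale (assumed 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values with |x| <= tau become 0."""
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def shrink(features, alpha):
    """Soft-threshold a list of activations.

    The threshold is alpha * mean(|x|). Here alpha is a plain argument in
    (0, 1); this is an assumption standing in for a learned scale factor.
    """
    if not features:
        return []
    tau = alpha * sum(abs(v) for v in features) / len(features)
    return [soft_threshold(v, tau) for v in features]


if __name__ == "__main__":
    print(hz_to_mel(1000.0))
    print(shrink([0.2, -0.1, 1.5, -2.0], alpha=0.5))
```

With this thresholding, activations whose magnitude falls below the threshold are set to zero while larger ones are reduced by the threshold, which is one way a network can "ignore noisy information" as the abstract puts it.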
Similar resources
Speech Emotion Recognition Using Scalogram Based Deep Structure
Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...
Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions
Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...
Speech Emotion Recognition Based on Sparse Representation
Speech emotion recognition is deemed to be a meaningful and intractable issue among a number of domains comprising sentiment analysis, computer science, pedagogy, and so on. In this study, we investigate speech emotion recognition based on sparse partial least squares regression (SPLSR) approach in depth. We make use of the sparse partial least squares regression method to implement the feature...
Efficient Emotion Recognition from Speech Using Deep Learning on Spectrograms
We present a new implementation of emotion recognition from the para-lingual information in the speech, based on a deep neural network, applied directly to spectrograms. This new method achieves higher recognition accuracy compared to previously published results, while also limiting the latency. It processes the speech input in smaller segments – up to 3 seconds, and splits a longer input into...
Improving of Feature Selection in Speech Emotion Recognition Based-on Hybrid Evolutionary Algorithms
One of the important issues in speech emotion recognition is selecting appropriate feature sets in order to improve the detection rate and classification accuracy. In previous studies, researchers tried to select appropriate features for classification by using feature-space selection and reduction methods, such as Fisher and PCA. In this research, a hybrid evolutionary algorit...
Journal
Journal title: Electronics
Year: 2023
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics12112512